The value iteration algorithm is not strongly polynomial for discounted dynamic programming
Authors
Abstract
This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iteration, the value iteration algorithm is not strongly polynomial for discounted dynamic programming. © 2014 Elsevier B.V. All rights reserved.
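For readers unfamiliar with the algorithm, the sketch below runs standard value iteration on a generic finite discounted MDP; the two-state instance, rewards, and discount factors are hypothetical and are not the construction used in the note. Running it shows the iteration count growing as the discount factor approaches 1; the note's result is stronger, namely that with exact computation the count can even be exponential in the number of actions.

```python
import numpy as np

def value_iteration(P, R, gamma, tol=1e-10, max_iter=1_000_000):
    """Standard value iteration for a finite discounted MDP.

    P[a][s, s'] : transition probability from s to s' under action a
    R[a][s]     : expected one-step reward for action a in state s
    gamma       : discount factor in (0, 1)

    Returns the value estimate, a greedy policy, and the iteration count.
    """
    n_actions, n_states = len(P), P[0].shape[0]
    V = np.zeros(n_states)
    for k in range(1, max_iter + 1):
        # Bellman backup: Q[a, s] = R[a][s] + gamma * sum_{s'} P[a][s, s'] * V[s']
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        # Classical stopping rule: the greedy policy w.r.t. V_new is then tol-optimal.
        if np.max(np.abs(V_new - V)) < tol * (1 - gamma) / (2 * gamma):
            return V_new, Q.argmax(axis=0), k
        V = V_new
    return V, Q.argmax(axis=0), max_iter

# Hypothetical 2-state, 2-action instance, used only to show how the
# iteration count grows as the discount factor approaches 1.
P = [np.array([[1.0, 0.0], [0.0, 1.0]]),   # action 0: stay in place
     np.array([[0.0, 1.0], [1.0, 0.0]])]   # action 1: switch states
R = [np.array([0.0, 1.0]), np.array([0.5, 0.0])]

for gamma in (0.9, 0.99, 0.999):
    _, policy, iters = value_iteration(P, R, gamma)
    print(f"gamma={gamma}: greedy policy {policy}, {iters} iterations")
```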
Similar articles
Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming
This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem with three states and four actions may grow arbitrarily. Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms a...
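As a point of comparison, here is a minimal sketch of modified policy iteration, assuming the same MDP representation as in the earlier sketch; it is not the three-state, four-action instance analyzed in that paper. Each pass performs one greedy improvement followed by m partial policy-evaluation backups, so m = 0 recovers value iteration and large m approaches classical policy iteration.

```python
import numpy as np

def modified_policy_iteration(P, R, gamma, m=5, tol=1e-10, max_iter=100_000):
    """Modified policy iteration: one greedy improvement followed by m
    partial policy-evaluation backups per pass (illustrative sketch only)."""
    n_actions, n_states = len(P), P[0].shape[0]
    V = np.zeros(n_states)
    for k in range(1, max_iter + 1):
        # Improvement: greedy policy with respect to the current value estimate.
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        policy, V_new = Q.argmax(axis=0), Q.max(axis=0)
        # Partial evaluation: m extra backups with the policy held fixed.
        for _ in range(m):
            V_new = np.array([R[policy[s]][s] + gamma * P[policy[s]][s] @ V_new
                              for s in range(n_states)])
        if np.max(np.abs(V_new - V)) < tol:
            return policy, V_new, k
        V = V_new
    return policy, V, max_iter
```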
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
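The first step of such a decomposition can be sketched as follows, assuming the MDP's transitions are given as a list of per-action matrices; the helper below only extracts the strongly connected components of the one-step reachability graph with SciPy and is not the full level-based algorithm proposed in that paper.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def scc_partition(P):
    """Group the states of an MDP into strongly connected components of the
    directed graph with an edge s -> s' whenever some action reaches s' from s
    with positive probability (illustrative sketch only)."""
    reach = (sum(P) > 0).astype(int)          # one-step reachability graph
    n_comp, labels = connected_components(csr_matrix(reach),
                                          directed=True, connection='strong')
    return [np.flatnonzero(labels == comp) for comp in range(n_comp)]

# On the hypothetical two-state MDP defined earlier, scc_partition(P)
# returns a single component containing both states.
```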
Recent Progress on the Complexity of Solving Markov Decision Processes
The complexity of algorithms for solving Markov Decision Processes (MDPs) with finite state and action spaces has seen renewed interest in recent years. New strongly polynomial bounds have been obtained for some classical algorithms, while others have been shown to have worst case exponential complexity. In addition, new strongly polynomial algorithms have been developed. We survey these result...
An approximation algorithm and FPTAS for Tardy/Lost minimization with common due dates on a single machine
This paper addresses the Tardy/Lost penalty minimization with common due dates on a single machine. According to this performance measure, if the tardiness of a job exceeds a predefined value, the job will be lost and penalized by a fixed value. Initially, we present a 2-approximation algorithm and examine its worst case ratio bound. Then, a pseudo-polynomial dynamic programming algorithm is de...
Global optimization of mixed-integer polynomial programming problems: A new method based on Grobner Bases theory
Mixed-integer polynomial programming (MIPP) problems are one class of mixed-integer nonlinear programming (MINLP) problems where objective function and constraints are restricted to the polynomial functions. Although the MINLP problem is NP-hard, in special cases such as MIPP problems, an efficient algorithm can be extended to solve it. In this research, we propose an algorit...
Journal: Oper. Res. Lett.
Volume: 42, Issue: -
Pages: -
Published: 2014